Abstract: This report approaches the development of actionable
intelligence for counterinsurgency by drawing parallels with the study of
criminal events such as homicides, vehicle thefts, and gang violence, and
by exploiting the methodological approaches that emphasize spatially
explicit information. This spatial analysis of crime builds on the
well-established methods of spatial data analysis and spatial statistics, and
applies these in the context of criminal events that happen at specific
locations. The theoretical background for these methods is drawn from
environmental criminology. Methods are categorized into three main
groups: exploratory spatial data analysis, explanatory spatial modeling,
and surveillance/forecasting techniques. The basic principles are outlined
and examples are provided that illustrate the application of specific
techniques in crime analysis. An initial methodological template is formulated
that stresses the constraints imposed by the quality and quantity of spatially
specific information available in a counterinsurgency context.

1 Introduction

Background 

Developing cultural information into cultural knowledge for military
operations is predominantly an intelligence activity that takes place within
the military decision making process (MDMP). MDMP includes mission
analysis, which produces an intelligence assessment, evaluation of courses
of action, and re-evaluation of the intelligence assessment. Intelligence
Preparation of the Battlefield (IPB) is performed before, during, and after the
mission analysis phase of the MDMP. Recent Army field manuals and
lessons learned documents emphasize the role of Every Soldier as Sensor
(ES2) in providing information for IPB. The incorporation of cultural
knowledge into IPB is recognized as especially critical for planning and
implementing counterinsurgency operations.

In practice, IPB involves collecting data manually or through sensors
coupled with computer analysis by highly trained intelligence analysts. The
products of these efforts are routinely classified and consequently
unusable by the tactical war fighter operating at the brigade combat
team level. Beyond the brief cultural training that brigade combat
teams receive shortly before deployment, there are few, if any, resources to
draw on for cultural information while in theater. Cultural “knowledge” is
gained through experience in theater. There is little cumulative storage of
this information and no formal process to pass this knowledge on to the
next replacement unit.

Objective 

The goal of the Actionable Cultural Understanding for Support to Tactical
Operations (ACUSTO) project is to provide a product for enhanced
cultural understanding that will be accessible to the tactical war fighter and
programmable into tactical spatial objects for a possible future
web-enabled decision support system.

Approach 

Dr. Luc Anselin, Director of Geographical Sciences at Arizona State
University, a National Academy of Sciences Scholar in Geography, and a
collaborator on the ACUSTO project, was asked to provide an analysis of
methodologies, knowledge systems, and spatial analytical techniques in light
of the need for socio-cultural content that must be considered to achieve a
future spatial decision support system that takes the MDMP to its next
state and provides the foundation for geographic evidential reasoning
models. The following pages present Dr. Anselin’s report documenting
this analysis, which concludes with an outline of a methodological
template for a future spatial decision support system to support MDMP and
Geographic Evidential Reasoning Models.

Mode  of  technology  transfer 

It is anticipated that the use of open source data to provide cultural
understanding in the operational environment will allow dissemination of
cultural knowledge to the lowest tactical level. Once the Soldier possesses
enhanced cultural knowledge, this will improve his/her ability to recognize
and document significant cultural information. Thus, the quality of
observations by ES2 regarding cultural factors will improve.

This report will be made accessible through the World Wide Web (WWW)
at URL:

http://www.cecer.army.mil

2 Toward a New Methodological Template
for Spatial Decision Support System

The scientific study of insurgency and counterinsurgency, including the
broad category of “deadly riots” (Horowitz 2001), is well established.
Several historical conflicts have been examined in great detail by social
scientists, military historians, and policy analysts (e.g., Galula 1964). Classic
examples are the well documented analyses by the Rand Corporation of
post-World War II conflicts, such as the Vietnam war (Vietnam, Laos), but also
insurgencies in other post-colonial conflicts such as Burma, Malaya,
Rhodesia, the Philippines, El Salvador, and Colombia (for a recent overview,
see Long 2006). Recently, interest has started to focus on the information
requirements and capabilities specifically targeted at counterinsurgency
(COIN), and the realization has gained ground that a specialized
intelligence operations infrastructure must be developed, different from the
support of traditional warfare (e.g., Gompert 2007, Libicki et al. 2007).

Specifically, in the context of the conflicts in Iraq and Afghanistan, there is
a growing awareness that the traditional IPB needs to evolve significantly
to meet the challenges presented in fourth-generation warfare and that a new
infrastructure for information operations needs to be developed. This
infrastructure requires non-traditional information to be collected, relies
heavily on human intelligence and on the understanding of cultural and
socio-economic factors and interpersonal networks, and increasingly employs
spatially explicit data and ethnographic intelligence (e.g., Hammes 2006,
Renzi 2006, Zeytoonian 2006, Baker 2007).

This work approaches the development of actionable intelligence for
counterinsurgency by drawing parallels with the study of civilian criminal
events, such as homicides, vehicle thefts, and gang violence, and by
exploiting the methodological approaches that emphasize spatially explicit
information. This spatial analysis of crime (Anselin et al. 2000, Messner
and Anselin 2004) builds on the well-established methods of spatial data
analysis and spatial statistics, and applies these in the context of criminal
events that occur at specific locations.

From  a  methodological  perspective,  the  study  of  the  location  of  violent 
events  associated  with  an  insurgency  is  a  special  case  of  “point  pattern 


ERDC/CERL  TR-09-13 


analysis.” Interest focuses on the extent to which such events cluster in
space and on the locations where those clusters (or “hot spots”) may be
found. Increasingly, this also includes attempts at explaining why the
clusters are where they are as a function of covariates (explanatory variables)
that can be readily measured. Point pattern analysis has seen extensive
application in ecology and epidemiology as well as in crime analysis (a classic
technical reference is Diggle 2003; a more introductory treatment and
extensive references can be found in Waller and Gotway 2004). Such
analyses of point events (or their aggregates by areal units) can be readily
extended to applications in the context of military conflicts, such as
improvised explosive device (IED) attacks (e.g., McFate 2005, Riese 2006,
Suen and Demirci 2006).

The remainder of the report consists of five additional chapters. Chapter 3
gives a general overview of the conceptual and methodological
background in a brief discussion of spatial knowledge systems for crime
analysis. This is followed by reviews of three methodological approaches that
have seen extensive application in the spatial crime analysis literature:
exploratory spatial data analysis (Chapter 4), explanatory modeling
(Chapter 5), and surveillance/forecasting (Chapter 6). These
topics are all addressed at a non-technical level; references are provided to
the methodological literature for technical details and to specific
applications in crime analysis for illustrations. Chapter 7 concludes with a
discussion of an initial framework for a methodological template to
support actionable intelligence input into geospatial evidential reasoning.

3 Spatial Knowledge Systems for Crime
Analysis

This chapter starts with a brief overview of the basic conceptual
framework behind environmental criminology, i.e., the study of criminal events
in which the “context” is viewed as providing important insight (e.g., as
compared to a focus on the individual). Next, some important aspects of
data integration are discussed, specifically with respect to the accuracy of
spatially explicit information. Finally, some remarks are formulated on
knowledge systems in support of crime analysis and how the various
analytical techniques fit into these knowledge management systems.

Environmental  criminology 

The basic tenet in environmental criminology is that place influences
crime. In other words, the location of criminal events is not random in
space, and the structure of the patterns of these events can be linked to
characteristics of the places where they occur, the places where the victims
live and/or the locations of the perpetrators. In the criminology literature,
two main theoretical frameworks have been developed to account for this.
In one, termed “routine activities theory” or “crime pattern theory” (Cohen
and Felson 1979, Brantingham and Brantingham 1981, 1984, Felson 1994),
the crime generating/crime attracting activities of places are viewed as the
central mechanism that brings both suitable targets and motivated
offenders together in time and space. In the other, referred to as “social
disorganization,” it is the local social and economic conditions of
neighborhoods and the lack of local social control (collective efficacy) that create
conditions for elevated criminal behavior (e.g., based on the early findings
of the “Chicago School” and more recently in the work of Sampson et al.,
such as Sampson et al. 1997, 2002).

Following the crime pattern theory, it is the daily routines of offenders in
particular that are worthy of consideration. Accordingly, the places where
offenders live, work, and play, and the pathways they follow to move
around will help to explain geographic offending patterns. On the other
hand, the social disorganization theory would stress that high crime
neighborhoods are typically distinguished by poverty, residential
instability, population heterogeneity, and family disruption. These neighborhoods
have little social cohesion and are marred by physical disorder; they are
littered with trash, vacant and abandoned buildings, graffiti, and other
signs of neglect. It is precisely in these types of neighborhoods that crime
“hot spots” most often emerge.

These theoretical frameworks suggest that attention to space and place is
warranted when trying to understand why violent events occur where they
do. In the context of violent acts committed by insurgents, this points to a
number of potential aspects that should be taken into account. For
example, routine activity theory would suggest that the places where people
gather (e.g., markets) and the routes they follow (e.g., routes followed by
military convoys) are more likely locations for attacks. Similarly,
neighborhoods that have become socially dysfunctional and that lack cohesion
would be potential “hot spots.” Paralleling efforts in the spatial analysis of
crime, such a study of insurgent violence would move from the exploration
to the explanation of patterns, leading to models that can be used as part
of a knowledge system supporting policing and counterinsurgency.

A particularly relevant subset of crime analysis pertains to the study of
gangs. In many respects, groups of insurgents share characteristics with
gangs, and could be studied using conceptual and methodological
frameworks that have been applied to gangs. Important aspects of these are the
concept of micro locations, or “set space,” where gangs tend to locate (Tita
et al. 2005) and patterns of spatial diffusion of gang activity (Cohen and
Tita 1999, Tita and Cohen 2004). A particularly promising approach is the
combination of concepts from spatial interaction with concepts of network
interaction (network or link analysis) in attempting to understand how the
spatial imprint of gang activity matches their social interaction (Tita 2007,
Tita and Ridgeway 2007). An illustration of the incorporation of insights
from a spatial analysis into a gang intervention operation is given in Tita et
al. (2003) for a case study in Los Angeles.

Data  integration 

Data from different sources need to be integrated into an operational
decision support system. In the context of counterinsurgency operations, a
distinction can be made between data on violent incidents (IED explosions,
mortar attacks, riots) and data characterizing the “context” of these
incidents (socio-demographic information on neighborhoods, physical
characteristics, base maps, etc.).


Observations on incidents are typically geocoded as point locations (i.e.,
their coordinates or latitude-longitude are recorded), although the
precision of the location may vary with the type of incident. For example, in
some instances, only a vague reference to a particular location may be
given, which precludes the use of point pattern analysis per se. Instead,
analysis would have to be carried out at a spatially aggregated level. In
contrast, observations on the cultural characteristics that may be used as
explanatory variables for the patterns may be point locations (e.g., the
location of physical facilities, such as bridges, religious buildings, police
stations) or they may only be available at a spatially aggregate level, such as a
neighborhood or a military grid (e.g., measures extracted from various text
documents on commercial activity, number of jobs created, political
activity, ethnic makeup). In addition, the spatial sampling of such data may be
incomplete, requiring the application of spatial interpolation to obtain full
coverage of the area of interest.

The effect of geocoding errors on the results of spatial analysis has
received some attention in the literature, primarily in the context of
protecting the privacy of medical records. In such instances, the original data are
often perturbed (e.g., randomly moved about, or “jiggled”) or aggregated
to a larger scale areal unit (e.g., sums of events by neighborhood, rather
than individual addresses). A few studies have formally addressed how
this affects the power of statistical tests and/or the quality of the
coefficient estimates obtained. It is typically found that greater perturbation or
aggregation lowers the power of tests. Examples are the study of the effect
of aggregation on the power of cluster tests in Jacquez and Waller (2000),
and on inference based on the much used scan statistic, as in Armstrong et
al. (1999), Cassa et al. (2006), and Olson et al. (2006). The effect of
masking on kriging interpolation and spatial autocorrelation analyses is
addressed in Gabrosek and Cressie (2002) and Cressie and Kornak (2003).
Again, not surprisingly, the quality of the statistical inference deteriorates
with a decreased precision of the locations used as inputs. A recent study
by Zimmerman and Pavlik (2008) confirms how multiple masked versions
of the data and mask metadata affect the estimates of parameters in a
clustered Poisson process. Further work in this area is needed to obtain
general guidelines for use in operational settings.
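
The two masking operations described above, random perturbation and aggregation to areal units, can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the function names, the uniform-disk displacement, and the square grid are assumptions made only for illustration.

```python
import math
import random
from collections import Counter

def jitter(points, max_shift, rng):
    """Displace each point by a random vector of length at most max_shift.

    Assumption: displacements are uniform over a disk; other masks
    (e.g., Gaussian or donut-shaped) are equally plausible.
    """
    masked = []
    for x, y in points:
        angle = rng.uniform(0.0, 2.0 * math.pi)
        radius = max_shift * math.sqrt(rng.random())  # uniform over the disk
        masked.append((x + radius * math.cos(angle),
                       y + radius * math.sin(angle)))
    return masked

def aggregate_to_grid(points, cell_size):
    """Count events per square grid cell, indexed by (column, row)."""
    return Counter((math.floor(x / cell_size), math.floor(y / cell_size))
                   for x, y in points)

# Hypothetical incident coordinates in arbitrary map units.
incidents = [(0.5, 0.5), (1.5, 0.5), (1.7, 0.2)]
masked = jitter(incidents, max_shift=0.1, rng=random.Random(0))
counts = aggregate_to_grid(incidents, cell_size=1.0)
```

Running a cluster test on `incidents`, on `masked`, and on `counts` for increasing values of `max_shift` or `cell_size` is one simple way to see the loss of power reported in the literature.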

A second important issue pertaining to data integration is the combination
of observations at different spatial scales. This is referred to as the change
of support problem (Gotway and Young 2002). A number of solutions
have been proposed in the statistical literature, ranging from interpolating
to a common aggregate frame to the combination of different spatial scales
through hierarchical Bayesian modeling (e.g., Banerjee et al. 2004,
Chapter 6). An important methodological question is how sensitive the
resulting inference is to decisions made about spatial scale and aggregation. A
number of case studies address this in the context of spatial analysis of
crimes and vehicle accidents (e.g., Thomas 1996, Wang 2005a). In
addition, the cultural variables collected are likely to be of different quality and
precision, some being only vague estimates or categories. This, in turn, will
affect the precision of the end result.

Knowledge  management  systems 

The ultimate objective of the spatial analysis of insurgent violence is a
decision support system that can be used in day-to-day planning (actionable
intelligence). The design of such systems has received considerable
attention in crime analysis and several systems are currently in operational use
by the police departments of larger metropolitan areas. A well known
example is the so-called COPLINK system, which consists of software tools
that extract information from various records and reports, combine data
from different sources, discover patterns and implement link analysis and
visualization in near real time (e.g., Chen et al. 2003, Chung et al. 2005,
Xiang et al. 2005, Zhao et al. 2006).

Gottschalk (2006) outlines a conceptual framework and taxonomy of
knowledge management systems in support of crime analysis. He distinguishes
four stages of increasing sophistication, moving from general
Information Technology (IT) support (such as spreadsheets), to information about
knowledge sources (such as intranets), to information representing
knowledge (such as a database or geodemographic profiles), and ending with an
expert system. The latter constitutes a complex knowledge system that
takes advantage of artificial intelligence to connect observed patterns to
real time actions. Gottschalk (2006) labels the four stages
“officer-to-technology systems,” “officer-to-officer systems,” “officer-to-information
systems,” and “officer-to-application systems.” A similar taxonomy can be
used to aid in the design of knowledge management systems to support
counterinsurgency, taking into account the special nature of information
gained through various intelligence systems (human intelligence, sensors,
etc.) and the different degrees of reliability of the data.

4 Exploratory Spatial Data
Analysis of Crime

Arguably the first stage in a spatial analysis of crime is the exploratory
stage. Exploratory data analysis (EDA) is a branch of statistics started by
John Tukey (1977), and stresses an inductive approach. As spelled out by
the statistician I.J. Good (1983), it is a collection of techniques used to
discover potentially explicable patterns. The emphasis is on discovery of
interesting patterns, which may be amenable to explanation, but the
explanation itself is not part of EDA. EDA consists of many different graphical
devices, such as charts, tables, graphs, and maps. These are referred to as
views of the data, facilitating interactive discovery through a combination
of graphical representations and summaries (Buja et al. 1996).

Exploratory Spatial Data Analysis (ESDA) is a subset of EDA that is
focused on the spatial aspects of the data (Anselin 1999). This includes
describing spatial distributions, identifying atypical spatial observations
(spatial outliers, as distinct from regular outliers), discovering patterns of
spatial association (spatial autocorrelation) and suggesting spatial regimes
(spatial heterogeneity).

The  techniques  reviewed  in  this  chapter  are  organized  into  four  groups: 

1.  General  crime  mapping  and  geovisualization 

2.  Traditional  point  pattern  analysis 

3.  Hot  spot  detection 

4.  Space-time  exploration. 

Crime  mapping 

In  the  late  1990s  and  early  21st  century,  the  use  of  computerized  crime 
mapping  saw  an  explosive  growth,  reflected  in  several  books  and  edited 
volumes  devoted  to  the  topic. 

Early examples include Block et al. (1995), Eck and Weisburd (1995),
Weisburd and McEwen (1997), LaVigne and Wartell (1998), and Harries
(1999). Increasingly, the traditional mapping (choropleth maps) and basic
spatial analysis operations (buffering, distance measures) are viewed as
integral parts of a geographic information system and extended with more
sophisticated (statistical) techniques to identify hot spots, highlight
outliers, and suggest patterns, as argued in Anselin et al. (2000). Extensive
illustrations can be found in Block (2000), Goldsmith et al. (2000),
LaVigne and Wartell (2000), Hirschfield and Bowers (2001), Leipnik and
Albert (2003), Boba (2005), Chainey and Ratcliffe (2005), Eck et al. (2005),
Wang (2005b) and Ratcliffe (2006).

Basic geographic information system (GIS) use and computerized maps
have become so standard in crime analysis that they will not be elaborated
on here. Some specialized maps warrant a brief mention, however. For
example, in Poulsen and Kennedy (2004), so-called “dasymetric maps” are
used to depict the spatial distribution of burglaries in an urban area. These
maps use additional GIS layers (such as housing units and land use) as a
filter to constrain the area of administrative areal units to reflect more
realistic locations for the crimes. In other words, these maps provide a
compromise between assigning the same rate to the full administrative unit
(the standard approach in choropleth mapping) and depicting the
individual point locations. This is especially useful when the latter are not
available and it avoids the potentially misleading effect of the area and
arbitrary boundaries of administrative units typical of choropleth maps.
Additional statistical maps can be used to avoid this problem, such as
cartograms, animation and conditional maps (cf. Anselin et al. 2006). Also,
specialized outlier maps can be employed to highlight locations with
unusually high values (Anselin 1999, Anselin et al. 2004), or to identify sharp
gradients in crime rates, i.e., so-called spatial outliers. For example, in
Harries (2006) neighborhoods (census block groups) are identified where
high quintile values are adjacent to low quintile values, suggesting an
extreme crime gradient.
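
The idea of flagging sharp gradients can be sketched in a few lines. This is only an illustration in the spirit of the quintile comparison described above, not the procedure used in Harries (2006): the rank-based quintile assignment, the adjacency dictionary, and all names are assumptions.

```python
def quintile_ranks(values):
    """Map each areal unit to a quintile 1..5 by rank.

    Assumption: ties are broken by sort order, which a production
    classifier would handle more carefully.
    """
    ordered = sorted(values, key=values.get)
    n = len(ordered)
    return {unit: (5 * rank) // n + 1 for rank, unit in enumerate(ordered)}

def extreme_gradients(values, neighbors):
    """Return (high_unit, low_unit) pairs where a top-quintile unit
    is adjacent to a bottom-quintile unit."""
    q = quintile_ranks(values)
    return sorted((a, b)
                  for a in values if q[a] == 5
                  for b in neighbors.get(a, ()) if q[b] == 1)

# Hypothetical crime rates and adjacency for five areal units.
rates = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0, "E": 50.0}
adjacency = {"A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"],
             "D": ["C", "E"], "E": ["D", "A"]}
pairs = extreme_gradients(rates, adjacency)
```

In this toy example only unit E falls in the top quintile, and of its neighbors only A falls in the bottom quintile, so the single flagged pair is (E, A).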

Additional methods, where the GIS and mapping are combined with
pattern analysis, hot spot detection, and surveillance are treated in the next
sections.

Pattern  analysis 

In this report, pattern analysis is used to designate traditional descriptive
and exploratory methods of statistical spatial point analysis as well as data
mining techniques that have evolved from the computer science literature.
It is distinguished from hot spot detection, where the focus is on the
use of indicators for spatial autocorrelation and clustering methods to
identify regions of elevated crime incidence or risk.

Statistical analysis of point patterns

Descriptive statistics for point patterns include mean and median location
and the standard deviational ellipse, which give an indication of the
central tendency in the spatial distribution of the points and the spread and
orientation of points around this center. These methods have been
implemented in the widely adopted CrimeStat software package (Levine 2006,
2007) as well as in a number of other software tools and have been used in
many applications. For example, LeBeau (1987) applied this technique to
track the changes in the spatial pattern of rapes. These methods can also
be readily incorporated into a GIS in support of policing actions
(e.g., to track the spatial dynamics of 911 calls).
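
A minimal sketch of the mean center and a standard deviational ellipse follows. It derives the ellipse from the covariance of the coordinates; this is an assumption made for illustration, and software such as CrimeStat uses formulas that may differ by scaling factors.

```python
import math

def mean_center(points):
    """Arithmetic mean of the x and y coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def standard_deviational_ellipse(points):
    """Return (center, major_sd, minor_sd, angle).

    angle is in radians, counterclockwise from the x-axis to the major
    axis. Axis lengths are the standard deviations along the principal
    directions of the coordinate covariance (population form, divisor n).
    """
    n = len(points)
    cx, cy = mean_center(points)
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    half_trace = (sxx + syy) / 2.0
    spread = math.hypot((sxx - syy) / 2.0, sxy)
    major_sd = math.sqrt(half_trace + spread)
    minor_sd = math.sqrt(max(half_trace - spread, 0.0))
    return (cx, cy), major_sd, minor_sd, angle

# Hypothetical events lying along a diagonal street.
events = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
center, major_sd, minor_sd, angle = standard_deviational_ellipse(events)
```

For these collinear events the ellipse degenerates to a line: the minor axis has zero length and the major axis is oriented at 45 degrees. Recomputing the ellipse for successive time periods gives a simple way to track shifts in a pattern.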

A more refined technique to describe the spatial distribution of points is
kernel smoothing, which creates a smooth surface representing the density
of the points. In essence, this is a weighted moving average of the count of
points within a circle of a given bandwidth, where the weights are given by
the chosen kernel function (for detailed illustration, see, e.g., Levine
2007). Some examples of applications are
Steenberghen et al. (2004), who use it to describe the distribution of road
accidents, and Corcoran et al. (2007), who include it in their review of
spatial analytical methods applied to the study of fires.
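
Kernel smoothing as described above can be sketched as follows. The quartic kernel is one common choice and is assumed here for illustration; the function names and the evaluation grid are likewise hypothetical, and the choice of bandwidth is left to the analyst.

```python
import math

def quartic_kernel_density(points, location, bandwidth):
    """Estimate event density at one location with a quartic kernel.

    Each event within the bandwidth contributes (3/pi) * (1 - u^2)^2,
    where u is its distance divided by the bandwidth; the sum is scaled
    by 1 / bandwidth^2 so that each event integrates to one.
    """
    sx, sy = location
    total = 0.0
    for x, y in points:
        u2 = ((x - sx) ** 2 + (y - sy) ** 2) / bandwidth ** 2
        if u2 < 1.0:
            total += (3.0 / math.pi) * (1.0 - u2) ** 2
    return total / bandwidth ** 2

def density_grid(points, xs, ys, bandwidth):
    """Evaluate the density on a grid; rows follow ys, columns follow xs."""
    return [[quartic_kernel_density(points, (x, y), bandwidth) for x in xs]
            for y in ys]

# Hypothetical events and a coarse 3 x 3 evaluation grid.
events = [(0.2, 0.3), (0.4, 0.1), (1.8, 1.9)]
surface = density_grid(events, xs=[0.0, 1.0, 2.0], ys=[0.0, 1.0, 2.0],
                       bandwidth=1.0)
```

A larger bandwidth yields a smoother surface that blurs local peaks, while a smaller one produces isolated spikes around individual events.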

Perhaps the most commonly used statistic to assess departures from
complete spatial randomness in a point pattern is Ripley’s (1976) K function.
The K function focuses on so-called second order properties of a point
pattern, which are similar to the notion of a covariance. The first order
characteristic is simply the intensity of the process, or the average number of
points per unit area, for example, as summarized in a kernel density
function. The second order characteristic is then some measure of covariance
between intensities at different locations. More precisely, the K function is
the ratio of the expected number of additional events within a given
distance from an arbitrary event to the intensity of the process. It is readily
calculated by counting the number of points within an increasing radius
from each event in the pattern. It is typically computed for a number of
distance ranges and plotted against distance. It is included as a function in
the CrimeStat software and has seen many applications (see also Anselin
et al. 2008).
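
The counting procedure just described can be sketched as a naive estimator. The sketch assumes a known study area and applies no edge correction, so values are biased downward for events near the boundary; the function name is hypothetical. Under complete spatial randomness the K function equals pi times the squared distance, which provides the benchmark for comparison.

```python
import math

def ripley_k(points, distances, area):
    """Naive estimate of Ripley's K at each distance in `distances`.

    K(d) = area / (n * (n - 1)) * (number of ordered pairs of distinct
    events no more than d apart). No edge correction is applied.
    """
    n = len(points)
    pair_distances = [
        math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
        for i in range(n) for j in range(i + 1, n)
    ]
    scale = area / (n * (n - 1))
    # Each unordered pair counts twice, once from each of its two events.
    return [scale * 2 * sum(1 for d in pair_distances if d <= r)
            for r in distances]

# Hypothetical events at the corners of a unit square in a 2 x 2 study area.
events = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
radii = [0.5, 1.0, 1.5]
k_hat = ripley_k(events, radii, area=4.0)
csr_benchmark = [math.pi * r ** 2 for r in radii]
```

Estimated values above the benchmark at a given distance suggest clustering at that scale, and values below it suggest regularity. In practice significance is usually assessed with simulation envelopes.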

While the K function focuses on the overall patterning of points
(“clustering”), interest often centers on the specific locations of “clusters.” By
itself, the K function cannot provide this information. An extension of the
notion of local indicators of spatial association (Anselin 1995) to identify
local clusters by means of the differential of the K function, the so-called
product density function, is advanced in Cressie and Collins (2001a, b). A
slightly different approach was recently presented in Mateu et al. (2007).

One limitation of the K function as traditionally applied is that it is best
suited for a situation of an isotropic plane, in which an event can be
located anywhere. However, in practice, there are often limitations to the
possible locations. For example, when events occur on a street network,
the space in between the network links and nodes becomes impossible as a
location. Recent work by Okabe and co-workers has extended the K
function to events on a network, using shortest path distances on the network
instead of the traditional omni-directional “as the crow flies” distance. The
basic methodology was established in a series of papers by Okabe et al.
(1995), Okabe and Kitamura (1996), Okabe and Yamada (2001), and
Okabe and Satoh (2005), and it has been implemented in the SANet
toolbox for spatial analysis on a network (Okabe et al. 2006a, b).

The  network  K  function  has  seen  applications  in  a  number  of  areas,  such 
as  the  location  of  acacia  plants  (Spooner  et  al.  2004)  and  accidents  on  a 
road  network  (Yamada  and  Thill  2004;  e.g.,  contrast  with  a  traditional  K 
function  analysis  of  traffic  accidents  in  Jones  et  al.  1996).  Yamada  and 
Thill  (2004)  also  carry  out  a  comparison  of  the  results  of  the  traditional 
(planar)  K  analysis  with  the  network  K  function.  Similarly,  in  Lu  and  Chen 
(2007),  the  results  of  a  planar  and  network  K  are  compared  for  urban 
crime on a street network. The planar K tends to result in false positives
for a less dense street network and low crime density; in contrast, a dense
street network and dense crime lead to more false negatives. In other words, the
performance  of  the  network  K  function  relative  to  the  planar  K  is  related  to 
the  structure  of  the  street  network  and  the  density  of  point  events.  Further 
work  is  needed  to  establish  the  degree  of  generality  of  the  findings  in  this 
case  study. 

Data  mining 

Parallel  to  the  attention  paid  to  pattern  recognition  from  a  statistical  view¬ 
point,  developments  in  computer  science  have  yielded  methods  of  ma¬ 
chine  learning  and  knowledge  discovery  that  are  designed  to  recognize 
patterns  in  multivariate  data  sets.  In  crime  analysis,  this  begins  with 
automatic  information  extraction  from  various  records  and  incident  re¬ 
ports  and  the  application  of  machine  learning  (such  as  text  mining)  and 
rule-based expert systems to ultimately yield an operational decision support
system. A recent overview of the application of data and text mining
in  crime  analysis,  with  an  emphasis  on  risk  and  threat  assessment,  and  the 
use  of  predictive  analytics  to  obtain  operationally  actionable  output  is 
given  by  McCue  (2007).  Discussions  of  different  approaches  can  also  be 
found  in  Brown  and  Hagen  (2003),  Chen  et  al.  (2003),  and  Yang  and  Li 
(2007).  Arguably  the  best-known  system  in  operational  use  to  date  is  the 
COPLINK  system  referred  to  in  Section  B.3. 

Hot  spot  detection 

Specialized  techniques  for  the  detection  of  hot  spots  follow  a  number  of 
different  logics.  Three  different  categories  are  distinguished  here:  scan  sta¬ 
tistics,  methods  based  on  spatial  autocorrelation  statistics,  and  generic 
cluster  detection  techniques.  They  are  briefly  reviewed  in  turn. 

Scan  statistics 

So-called  scan  statistics  consist  of  counting  the  number  of  events  in  a 
geometric  shape  (usually  a  circle)  and  comparing  those  to  a  reference  pat¬ 
tern  of  spatial  randomness.  Early  examples  are  the  Geographical  Analysis 
Machine  (GAM)  of  Openshaw  et  al.  (1987),  and  the  space-time  analysis  of 
crime  (STAC)  of  Block  (1995,  2000).  Both  of  these  methods  consist  of 
counting  the  number  of  points  in  a  series  of  overlapping  circles  and  label¬ 
ing  them  as  significant  when  the  observed  count  is  extreme  relative  to  a 
reference  distribution  of  simulated  spatially  random  points.  The  STAC 
method  is  implemented  in  the  CrimeStat  software  package,  in  which  an 
identified cluster of points is represented by their standard deviational
ellipse (see centrography in C.2.1).
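
The counting logic shared by these early methods can be sketched for a single circle. This is an illustration only, not the GAM or STAC algorithm: both scan many overlapping circles, whereas the sketch evaluates one, and the unit-square study area, the event coordinates, and the function names are assumptions.

```python
import math
import random


def count_in_circle(points, center, radius):
    """Number of points within `radius` of `center`."""
    cx, cy = center
    return sum(1 for x, y in points if math.hypot(x - cx, y - cy) <= radius)


def circle_p_value(points, center, radius, n_sim=199, seed=0):
    """Monte Carlo p-value against uniform random points in the unit square."""
    rng = random.Random(seed)
    observed = count_in_circle(points, center, radius)
    as_extreme = 0
    for _ in range(n_sim):
        simulated = [(rng.random(), rng.random()) for _ in points]
        if count_in_circle(simulated, center, radius) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_sim + 1)


# Hypothetical events: 30 packed around (0.5, 0.5) plus 10 scattered ones.
rng = random.Random(42)
cluster = [(0.5 + rng.uniform(-0.02, 0.02), 0.5 + rng.uniform(-0.02, 0.02))
           for _ in range(30)]
background = [(rng.random(), rng.random()) for _ in range(10)]
events = cluster + background
observed, p_value = circle_p_value(events, center=(0.5, 0.5), radius=0.05)
```

Repeating this for every circle in a dense grid of centers and radii is what gives rise to the multiple comparison problem.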

These  early  scan  statistics  suffer  from  the  problem  of  multiple  compari¬ 
sons  (overlapping  circles)  and  are  sensitive  to  parameter  settings  (radius 
of  circle,  etc.).  The  Kulldorff  (1997, 1999)  scan  statistic  and  its  later  re¬ 
finements  address  some  of  these  concerns  by  using  a  likelihood  criterion 
to identify clusters. In essence, the scan statistic considers circles of
increasing radius and identifies the circle that maximizes the likelihood
ratio for an elevated event rate inside the circle relative to outside.
Kulldorff's scan statistic is implemented in the specialized SaTScan software
package (http://www.satscan.org). A recent generalization to the detection of arbitrarily
shaped  hotspots  is  the  so-called  upper  level  set  (ULS)  scan  statistic  (Patil 
and  Taillie  2003)  and  its  extension  to  bivariate  data  contexts  in  Modarres 
and  Patil  (2007). 
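
The likelihood criterion can be illustrated with the log-likelihood ratio commonly used for the Poisson version of the scan statistic. The sketch assumes that expected counts have already been scaled to sum to the total number of events; the candidate circles and their counts are hypothetical, and the Monte Carlo step that assesses significance of the maximum is omitted.

```python
import math


def poisson_llr(c, e, total):
    """Log-likelihood ratio for one candidate circle under a Poisson model.

    c: observed events inside; e: expected events inside under the null;
    total: all observed events in the study area.
    """
    if c <= e:
        return 0.0  # only elevated rates inside the circle count as clusters
    inside = c * math.log(c / e)
    outside = 0.0
    if total > c:
        outside = (total - c) * math.log((total - c) / (total - e))
    return inside + outside


# Hypothetical candidate circles: (label, observed inside, expected inside).
candidates = [("A", 10, 5.0), ("B", 6, 5.0), ("C", 3, 5.0)]
total_events = 100
scores = {label: poisson_llr(c, e, total_events) for label, c, e in candidates}
most_likely_cluster = max(scores, key=scores.get)
```
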

An alternative extension is the augmentation of the likelihood idea of the
scan  statistic  with  an  optimization  procedure  using  simulated  annealing  to 
detect  spatial  clusters  of  arbitrary  shape  by  Duczmal  and  Assuncao  (2004). 
This  is  applied  to  the  identification  of  clusters  in  the  spatial  distribution  of 
homicides  in  Belo  Horizonte,  Brazil. 

Spatial  autocorrelation  statistics 

A  second  broad  category  of  approaches  bases  the  identification  of  clusters 
and  spatial  outliers  on  the  results  of  a  statistical  test  for  spatial  autocorre¬ 
lation.  These  methods  pertain  to  data  that  have  been  aggregated  into  areal 
units, such as administrative units or artificial grids, so-called lattice data
(contrasting  with  point  patterns).  For  example,  in  spatial  crime  analysis, 
this  often  pertains  to  the  count  of  events  by  spatial  unit,  or  to  a  rate  (the 
count  of  events  divided  by  the  population  at  risk). 

A  spatial  autocorrelation  statistic  is  a  formal  test  of  the  match  between 
value  or  attribute  similarity  and  locational  similarity.  The  statistic  summa¬ 
rizes  both  aspects  and  is  deemed  to  be  significant  if  the  probability  (p- 
value)  that  the  statistic  would  take  this  value  in  a  spatially  random  pattern 
is  extremely  low.  Measures  of  attribute  similarity  summarize  the  similarity 
(or  dissimilarity)  between  the  values  observed  at  two  locations.  Three 
popular  formal  expressions  for  this  are  the  cross  product  (as  a  measure  of 
similarity),  and  the  squared  difference  and  absolute  difference  (as  meas¬ 
ures  of  dissimilarity).  Locational  similarity  is  formalized  through  a  spatial 
weights  matrix,  which  expresses  the  notion  of  neighbor.  Spatial  weights 
are  not  necessarily  geographical,  but  can  incorporate  social  network  struc¬ 
tures  as  well  (for  a  classic  treatment  of  spatial  autocorrelation,  see  Cliff 
and  Ord  1973, 1981). 
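
As an illustration of how the cross product and the spatial weights combine, the following sketch computes the global Moran's I statistic. The four areal units, their counts, and the binary contiguity weights are hypothetical; inference (for example by permutation) is not shown.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a dict-of-dicts weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(w for row in weights.values() for w in row.values())
    # Cross products of deviations, weighted by locational similarity.
    cross = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    return (n / s0) * cross / sum(d * d for d in z)


# Hypothetical example: four areal units along a line, binary contiguity.
counts = [1.0, 2.0, 3.0, 4.0]
w = {0: {1: 1.0}, 1: {0: 1.0, 2: 1.0}, 2: {1: 1.0, 3: 1.0}, 3: {2: 1.0}}
moran = morans_i(counts, w)
```

The positive value obtained for this steadily increasing sequence reflects that neighboring units carry similar values.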

Similar  to  the  K  function  for  point  pattern  analysis  (p  11),  a  global  spatial 
autocorrelation  statistic  (like  Moran’s  I  or  Geary’s  c)  is  not  appropriate  for 
the  identification  of  local  clusters  or  hot  spots.  To  that  end,  a  local  version 
of the statistic needs to be employed, a so-called “Local Indicator of
Spatial Association,” or LISA (Anselin 1995). Significant LISA statistics
suggest locations where the value of the variable of interest is more similar
to that of its neighbors than would be likely under spatial randomness. Such
locations are identified as local clusters, either hot spots (high values
surrounded by other high values) or cold spots (low values surrounded by low
values). Alternatively, significant LISA statistics that indicate negative
local spatial autocorrelation identify spatial outliers, where low values are
surrounded by high values, or vice versa.
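
The classification into hot spots, cold spots, and spatial outliers can be sketched with a local Moran computation. The sketch assumes row-standardized contiguity weights and hypothetical counts for five units on a line; it returns the statistic and a quadrant label only, and omits the permutation-based significance test that a real analysis would require.

```python
def local_moran(values, neighbors):
    """Local Moran statistics (row-standardized weights) with quadrant labels."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    results = []
    for i in range(n):
        # Spatial lag: average deviation among the neighbors of unit i.
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        quadrant = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((z[i] * lag / m2, quadrant))
    return results


# Hypothetical counts for five areal units along a line.
counts = [8.0, 9.0, 10.0, 1.0, 3.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
lisa = local_moran(counts, neighbors)
quadrants = [q for _, q in lisa]
```

Labels HH and LL correspond to candidate hot and cold spots, while HL and LH mark candidate spatial outliers.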

A  commonly  used  LISA  statistic  is  the  local  Moran,  a  location-specific  ver¬ 
sion  of  the  familiar  Moran’s  I  statistic  for  spatial  autocorrelation  (Anselin 
1995).  This  has  been  applied  to  the  identification  of  high  homicide  county 
clusters  in  Messner  et  al.  (1999),  for  example  (for  more  extensive  over¬ 
views,  see  also  Messner  and  Anselin  2004  and  Anselin  et  al.  2008).  A  re¬ 
lated  application  is  to  the  identification  of  so-called  black  zones,  or  road 
segments  that  exhibit  an  extreme  number  of  vehicle  accidents  (for  an  early 
approach,  see  Black  and  Thomas  1998).  Local  Moran  statistics  are  used  to 
identify  significant  concentrations  of  high  accident  numbers  in  Flahaut  et 
al.  (2003)  and  Steenberghen  et  al.  (2004)  (see  also  Geurts  et  al.  2004,  for 
an  assessment  of  methods  to  identify  and  rank  black  zones).  A  related  ap¬ 
proach  is  the  extension  of  the  network  K  function  and  the  LISA  statistic  to 
local  indicators  of  network  constrained  clusters  (LINCS)  in  Yamada  and 
Thill  (2007).  This  is  also  used  to  identify  segments  on  a  road  network  with 
elevated  numbers  of  vehicle  crashes. 

A  slightly  different  local  statistic  is  the  Gi  (and  Gi*)  test  developed  by  Getis 
and  Ord  (1992)  (see  also  Ord  and  Getis  1995).  Similar  to  the  local  Moran, 
this  statistic  identifies  locations  of  local  hot  spots  and  local  cold  spots  (but 
not  spatial  outliers).  It  has  been  applied  to  the  study  of  burglaries  in  urban 
areas  by  Craglia  et  al.  (2000).  Interestingly,  in  that  study,  the  Gi  statistic  is 
compared  to  the  more  traditional  STAC  approach  and  found  to  be  superior 
in identifying true clusters. Ratcliffe and McCullagh (1999) use the Gi
statistic in combination with a global moving window to distinguish between
hotspots and hotbeds in residential burglary and motor vehicle crime. They
suggest that some of the problems caused by the modifiable areal unit problem
(MAUP) are avoided by changing the search area of the moving window.
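
For reference, the standardized form of the Gi* statistic, as it is usually written following Ord and Getis (1995), is given below. The notation is an assumption of this sketch: x_j is the value at location j, w_ij is the spatial weight between i and j (with w_ii nonzero for Gi*), and n is the number of locations.

```latex
G_i^{*} =
  \frac{\sum_{j=1}^{n} w_{ij} x_j \;-\; \bar{x} \sum_{j=1}^{n} w_{ij}}
       {s \sqrt{\dfrac{n \sum_{j=1}^{n} w_{ij}^{2} - \left(\sum_{j=1}^{n} w_{ij}\right)^{2}}{n-1}}},
\qquad
\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j,
\qquad
s = \sqrt{\frac{1}{n}\sum_{j=1}^{n} x_j^{2} - \bar{x}^{2}}
```

Large positive values point to a concentration of high values around location i and large negative values to a concentration of low values, which is why the statistic does not flag spatial outliers.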

Local  spatial  autocorrelation  measures  are  included  in  the  software  GeoDa 
(Anselin  et  al.  2006),  Space-Time  Analysis  of  Regional  Systems  (STARS) 
(Rey  and  Janikas  2006),  CrimeStat  (Levine  2006),  the  ArcGIS  spatial  sta¬ 
tistics  toolbox,  the  open  source  R  spdep  (spatial  dependence)  package,  as 
well  as  several  others. 

Generic  cluster  detection 

A  third  category  of  methods  to  detect  hot  spots  uses  heuristic  methods 
from  the  discipline  of  operations  research  to  construct  clusters  of  areas 
that are similar with respect to some characteristic. These techniques can
be  applied  to  individual  points  or  to  aggregate  spatial  units.  Specifically, 
clusters  are  formed  such  that  the  similarity  of  the  cluster  members  within 
the  same  cluster  is  greater  than  between  clusters.  Similarity  can  be  based 
on  distance  or  on  a  multivariate  characterization  (as  in  k-means  cluster¬ 
ing).  Applications  of  these  techniques  to  urban  crime  in  Queensland  are 
illustrated  in  Murray  et  al.  (2001)  (see  also  Murray  and  Estivill-Castro 
1998). 

A recent article by Grubesic (2006) suggests that fuzzy clustering techniques
may be superior in some respects to the standard hierarchical clustering
techniques. Such fuzzy methods do not assign each observation “hard”
membership in a single cluster, but instead yield a degree of membership in
each cluster. This creates some challenges for the visualization of the
results, which can be addressed, e.g., by means of membership probability
surfaces. Grubesic (2006) illustrates this with
an  application  to  crime  events  in  a  neighborhood  in  Cincinnati,  Ohio.  Re¬ 
lated  approaches  are  so-called  contiguity-constrained  clustering  methods 
(Duque  et  al.  2007a,  b),  where  it  is  guaranteed  that  the  identified  clusters 
consist  of  connected  spatial  units,  which  is  not  always  the  case  when  using 
standard  clustering  algorithms,  such  as  the  k-means  clustering  contained 
in  CrimeStat. 
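
The contrast between hard and fuzzy membership can be sketched with the membership formula of the standard fuzzy c-means algorithm. This is not necessarily the exact procedure used by Grubesic (2006); the cluster centers, the event location, and the fuzzifier value m = 2 are assumptions.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (fuzzifier m > 1)."""
    distances = [math.dist(point, c) for c in centers]
    if 0.0 in distances:
        # The point coincides with a center: full membership there.
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in distances)
            for d_k in distances]


# Hypothetical cluster centers and one crime event located between them.
centers = [(0.0, 0.0), (2.0, 0.0)]
memberships = fuzzy_memberships((0.5, 0.0), centers)
```

A hard method such as k-means would assign this event entirely to the first cluster, whereas the fuzzy memberships retain the information that it lies between the two.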

Space-time  exploration 

Many  techniques  to  explore  patterns  that  occur  both  across  space  and  over 
time  are  straightforward  generalizations  of  pure  cross-sectional  methods. 
For  example,  the  scan  statistic  (C.3.1)  can  be  extended  to  identify  space- 
time  clusters,  the  local  Moran  statistic  (C.3.2)  can  be  applied  to  compare 
patterns  of  occurrence  with  that  of  neighbors  at  a  different  point  in  time, 
etc.  The  research  question  at  hand  is  very  similar  to  that  employed  in  epi¬ 
demiological  studies  of  the  spread  of  disease.  In  spatial  crime  analysis,  the 
counterpart  of  this  is  the  notion  that  the  risk  of  a  particular  criminal  event 
spreads  over  time  to  nearby  locations.  Space-time  exploration  has  many 
commonalities  with  the  surveillance  and  forecasting  methods  discussed  in 
Section  E.  The  distinction  between  the  two  categories  is  admittedly  some¬ 
what  arbitrary.  In  Section  E,  the  emphasis  is  on  methods  that  have  been 
expressly  presented  in  a  context  of  surveillance  and  forecasting,  whereas 
the  methods  covered  here  are  more  in  an  exploratory  vein,  without  neces¬ 
sarily  being  used  in  an  explicit  surveillance  context. 

A commonly used procedure inspired by the statistical point pattern
literature is the Knox test to identify space-time clusters (see Diggle 2003).
For example, this was applied in a wide-ranging comparison of space-time
patterns in burglaries across 10 urban areas (Johnson et al. 2007). A similar
extension  is  the  random  point  nearest  neighbor  technique  of  Ratcliffe 
(2005),  which  is  applied  to  the  change  in  the  spatial  distribution  of  bur¬ 
glaries  in  Canberra,  Australia. 
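
The logic of the Knox test can be sketched as follows: count the pairs of events that are close in both space and time, then compare that count with the counts obtained when event times are randomly reassigned to locations. The distance and time thresholds, the event list, and the function names are hypothetical choices for illustration.

```python
import math
import random


def knox_statistic(events, max_dist, max_lag):
    """Number of event pairs that are close in both space and time."""
    count = 0
    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            (x1, y1, t1), (x2, y2, t2) = events[i], events[j]
            if math.hypot(x1 - x2, y1 - y2) <= max_dist and abs(t1 - t2) <= max_lag:
                count += 1
    return count


def knox_test(events, max_dist, max_lag, n_perm=199, seed=0):
    """Observed Knox count and a permutation p-value (times shuffled)."""
    rng = random.Random(seed)
    observed = knox_statistic(events, max_dist, max_lag)
    times = [t for _, _, t in events]
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(times)
        permuted = [(x, y, t) for (x, y, _), t in zip(events, times)]
        if knox_statistic(permuted, max_dist, max_lag) >= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_perm + 1)


# Hypothetical burglaries as (x, y, day).
events = [(0.0, 0.0, 0), (0.1, 0.0, 1), (5.0, 5.0, 0), (5.0, 5.1, 30), (10.0, 0.0, 2)]
observed, p_value = knox_test(events, max_dist=1.0, max_lag=3)
```

Only the first two events are close in both dimensions; the pair near (5, 5) is close in space but 30 days apart and therefore does not count.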

A  related  interest  in  the  study  of  the  “contagion”  of  crime  risk  is  whether 
some  type  of  displacement  may  occur,  particularly  due  to  a  previous  police 
intervention.  This  focus  on  displacement  is  the  topic  of  a  number  of  ef¬ 
forts,  such  as  the  so-called  aoristic  signatures  of  Ratcliffe  (2000,  2002) 
and  the  weighted  displacement  quotient  of  Bowers  and  Johnson  (2003). 
Aoristic  signatures  are  a  method  to  deal  with  the  imprecision  in  the  re¬ 
corded  time  of  the  criminal  event.  A  temporal  weight  is  constructed  to  re¬ 
flect  the  probability  that  an  event  occurred  in  a  given  period.  These 
weights  can  be  attached  to  the  spatial  locations  of  the  events  and  yield  dif¬ 
ferent  visualizations  (e.g.,  the  cylinders  used  in  Ratcliffe  2000)  and  surface 
representations. The weighted displacement quotient uses a rationale similar
to that of local space-time autocorrelation quotients, in that changes in the
crime rate are examined in a buffer zone around the original location of
criminal events. This yields a type of location quotient that incorporates
a measure of change over time (see Bowers and Johnson 2003).
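
The construction of temporal weights for events with imprecise times can be sketched as follows. The sketch assumes that each event is known only to have occurred within a window of hours on a single day (no window crosses midnight) and spreads a total weight of one over hourly bins in proportion to overlap; the windows and function names are hypothetical.

```python
def aoristic_weights(start_hour, end_hour, n_bins=24):
    """Spread one event over hourly bins in proportion to overlap with its window."""
    duration = end_hour - start_hour
    weights = [0.0] * n_bins
    for b in range(n_bins):
        overlap = min(end_hour, b + 1) - max(start_hour, b)
        if overlap > 0:
            weights[b] = overlap / duration
    return weights


def aoristic_profile(windows):
    """Sum the weights of all events to obtain an hourly risk profile."""
    profile = [0.0] * 24
    for start, end in windows:
        for b, w in enumerate(aoristic_weights(start, end)):
            profile[b] += w
    return profile


# Hypothetical burglaries, each known only to fall within a time window.
windows = [(8.0, 12.0), (9.5, 10.5), (10.0, 11.0)]
profile = aoristic_profile(windows)
```

Attaching such weights to the event locations yields one weighted map per time period.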

Other  approaches  consist  of  creative  extensions  of  cartographic  techniques 
to capture the spatial dynamics of criminal events. As reviewed in
Brunsdon et al. (2007), exploratory space-time visualization can be carried out by
means  of  map  animation,  creative  use  of  so-called  comaps,  isosurfaces, 
and  linked  plots. 

Griffith  and  Chavez  (2004)  use  an  innovative  combination  of  local  spatial 
autocorrelation  statistics  (i.e.,  ESDA)  with  the  trajectory  method  proposed 
by  Nagin  (1999)  to  study  the  space-time  dynamics  of  crime  in  Chicago 
neighborhoods.  Applying  the  trajectory  method  to  the  crime  patterns  over 
time  for  each  neighborhood  studied  yields  a  grouping  of  neighborhoods  by 
trajectory  type.  This  is  then  examined  by  means  of  local  spatial  autocorre¬ 
lation  statistics  to  assess  the  extent  to  which  neighborhoods  with  similar 
trajectories  also  cluster  in  space. 

A similarly creative combination of techniques is the use of circular
statistics to compare the dynamics of criminal events outlined in Brunsdon and
Corcoran (2006). The circular statistics (originally developed to analyze
directional patterns) are adapted to assess and model geographical
patterns in the daily cycles of events. Specifically, Brunsdon and Corcoran
(2006)  apply  this  to  study  criminal  damage  in  the  city  of  Cardiff,  Wales 
and  use  a  kernel  smoothing  technique  to  visually  represent  the  distribu¬ 
tion  by  time  of  day.  This  is  then  applied  to  a  geographical  comparison  be¬ 
tween  the  city  center  and  the  rest  of  the  city. 
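
The reason circular statistics are needed for time-of-day data can be shown with a short sketch of the circular mean. It is a generic illustration, not the kernel smoothing procedure of the cited study; the event times are hypothetical.

```python
import math


def circular_mean_hour(hours):
    """Mean time of day (hours) and mean resultant length of event times."""
    angles = [2.0 * math.pi * h / 24.0 for h in hours]
    c = sum(math.cos(a) for a in angles) / len(angles)
    s = sum(math.sin(a) for a in angles) / len(angles)
    mean_angle = math.atan2(s, c) % (2.0 * math.pi)
    # The resultant length is near 1 when events concentrate at one time of day.
    return mean_angle * 24.0 / (2.0 * math.pi), math.hypot(c, s)


# Hypothetical incident times clustered around midnight.
late_night = [23.0, 1.0, 0.0, 22.0, 2.0]
mean_hour, concentration = circular_mean_hour(late_night)
arithmetic_mean = sum(late_night) / len(late_night)  # misleading mid-morning value
```

The circular mean falls at midnight, whereas the ordinary arithmetic mean of the same times falls in mid-morning, when no event occurred.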

5 Explanatory Modeling of Crime

Explanatory  modeling  of  crime  moves  beyond  exploration  and  identifica¬ 
tion  of  patterns  to  the  modeling  of  crime  event  counts,  rates,  or  risk  as  a 
function  of  explanatory  variables,  or  covariates.  The  covariates  are  typi¬ 
cally  suggested  by  theoretical  frameworks  in  environmental  criminology 
(B.1) and include characteristics of the perpetrator, victim, the location
where  the  event(s)  happened,  and  the  environmental  context.  In  this  sec¬ 
tion,  these  approaches  are  classified  into  three  broad  categories: 

1.  Traditional  regression  modeling,  where  the  crime  event  is  on  the  left  hand 
side  of  an  equation  and  the  covariates  are  on  the  right  hand  side 

2.  A  special  case  of  regression  modeling,  where  the  focus  is  on  repeat  offend¬ 
ers  and  the  use  of  geographic  characteristics  (such  as  distance  to  the  event) 
to  model  the  probability  of  an  additional  event  occurring  in  a  particular  lo¬ 
cation 

3.  A  brief  review  of  simulation  approaches  in  the  form  of  agent-based  mod¬ 
els. 

Regression  models 

The  environmental  tradition  in  criminology  has  yielded  a  vast  number  of 
regression  analyses  where  the  rate  of  one  or  more  types  of  violent  crime 
(homicides,  burglaries,  etc.)  in  a  spatial  unit  of  reference  is  related  to  a  set 
of  covariates  (for  overviews,  see  Anselin  et  al.  2000;  Messner  and  Anselin 
2004).  Such  ecological  regression  has  been  carried  out  for  a  range  of  dif¬ 
ferent  spatial  scales,  such  as  neighborhoods,  census  tracts,  counties  and 
metropolitan  areas,  both  in  a  pure  cross-section  as  well  as  including  ob¬ 
servations  over  time  and  across  space. 

Covariates  commonly  consist  of  neighborhood  characteristics  (based  on 
the  social  disorganization  tradition),  such  as  socio-economic  conditions, 
deprivation,  residential  stability,  ethnicity,  education,  as  well  as  character¬ 
istics  of  locations  that  would  be  conducive  to  crime  (routine  activities), 
such  as  presence  of  (or  distance  to)  liquor  stores  and  bars.  Typically,  these 
covariates  are  extracted  from  census  sources,  sometimes  reduced  in  di¬ 
mension  by  means  of  factor  analysis  (due  to  the  high  degree  of  collinear- 
ity). 

Apart from various empirical applications (too numerous to be reviewed
here),  attention  also  focuses  on  some  important  methodological  concerns, 
particularly  dealing  with  the  spatial  nature  of  the  data  (spatial  economet¬ 
rics)  and  assessing  different  estimation  methods. 

The use of cross-sectional data for aggregate spatial units requires an
explicit consideration of spatial autocorrelation and spatial heterogeneity,
which is accomplished by means of the methodology of spatial econometrics
(Anselin 1988). In Baller et al. (2001), the importance of using the
proper  spatial  econometric  estimation  methods  is  illustrated  for  a  study  of 
homicide  rates  in  U.S.  counties.  They  use  a  classical  perspective  and  limit 
the  discussion  to  linear  regression  models  and  the  application  of  spatial 
lag  and  spatial  error  models.  A  Bayesian  perspective  is  taken  in  the  work  of 
Law  and  Haining  (2004),  where  spatial  autocorrelation  is  taken  into  ac¬ 
count  in  a  logistic  regression  through  a  random  effects  specification  in  a 
Bayesian  hierarchical  model  of  high  intensity  crime  areas  in  Sheffield, 
England. This extends earlier studies that used standard logistic regression
techniques (Craglia et al. 2004, 2005). Malczewski and Poetz (2005)
address  spatial  heterogeneity  explicitly  by  applying  the  geographically 
weighted  regression  method  (GWR,  Fotheringham  et  al.  2002)  in  a  study 
of  residential  burglaries  in  London,  Ontario.  Wang  (2005a)  focuses  on  the 
role  of  spatial  scale  and  the  associated  MAUP  by  considering  spatial  ag¬ 
gregation  at  different  scales. 
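
For reference, the spatial lag and spatial error specifications mentioned above are usually written as follows. The notation is an assumption of this sketch: y is the vector of crime rates, X the matrix of covariates, W the spatial weights matrix, and the error term is taken to be independent and identically distributed.

```latex
% Spatial lag model: the rate in a unit depends on rates in neighboring units
y = \rho W y + X\beta + \varepsilon

% Spatial error model: spatial dependence enters through the error term
y = X\beta + u, \qquad u = \lambda W u + \varepsilon
```

Here the coefficients rho and lambda measure the strength of spatial dependence; when both are zero, each model reduces to the standard linear regression.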

Most  studies  consider  crime  data  as  continuous  variables,  aggregated  to 
spatial  units.  In  contrast,  Osgood  (2000)  takes  into  account  the  discrete 
count  nature  of  criminal  events  through  the  application  of  Poisson  regres¬ 
sion.  Another  important  category  consists  of  studies  where  criminal  be¬ 
havior  is  conceptualized  as  a  choice  process.  For  example,  in  Xue  and 
Brown  (2003,  2006)  crime  is  analyzed  within  the  methodological  frame¬ 
work  of  discrete  choice  theory.  In  addition,  some  innovative  techniques 
are  introduced  to  proxy  unobserved  actual  choice  behavior  by  characteris¬ 
tics  of  the  environment,  such  as  distance  to  various  “features.”  In  Haynie 
et  al.  (2006)  neighborhood  characteristics  and  peer  influence  (social  net¬ 
works)  are  considered  explicitly  in  a  study  of  adolescent  violence. 
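
A Poisson regression for crime counts by areal unit is commonly written with the population at risk as an exposure term. The symbols below are assumptions of this sketch: y_i is the event count in unit i, n_i the population at risk, and x_i the vector of covariates.

```latex
y_i \sim \mathrm{Poisson}(\mu_i), \qquad
\ln \mu_i = \ln n_i + x_i^{\prime}\beta
\quad\Longleftrightarrow\quad
\ln\!\left(\frac{\mu_i}{n_i}\right) = x_i^{\prime}\beta
```

Fixing the coefficient of the log population at one turns a model for counts into a model for rates, while still respecting the discrete nature of the events in units with small populations.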

Panel  data,  i.e.,  combinations  of  observations  across  space  and  over  time, 
have  been  considered  as  well,  particularly  in  an  attempt  to  eliminate  unob¬ 
served  heterogeneity.  Most  of  these  use  the  standard  fixed  effects  or  ran¬ 
dom  effects  methods.  Particular  attention  to  methodological  issues  is  given 
by Worrall and Pratt (2004), where the focus is on dealing with unobserved
heterogeneity, and Kakamu et al. (2008), where Bayesian spatio-temporal
models are applied. Phillips and Greenberg (2008) compare several
methods for pooled cross section and time series data, such as fixed
effects  and  random  effects,  as  well  as  an  innovative  use  of  latent  growth 
curve  models. 

An  interesting  specialized  literature  deals  with  explanatory  regression 
models  for  road  accidents,  i.e.,  models  that  provide  explanation  for  the  lo¬ 
cation  of  black  zones  (road  segments  with  elevated  accident  counts).  For 
example,  Flahaut  (2004)  uses  a  logistic  regression  with  spatial  autocorre¬ 
lation  to  this  effect.  Somewhat  related  is  the  analysis  of  motor  vehicle  acci¬ 
dent  injuries  in  McNab  (2004),  where  a  Bayesian  random  component 
model  is  used  to  account  for  spatial  autocorrelation. 

Geographic  profiling 

A  special  case  of  explanatory  models  is  so-called  geographic  profiling, 
where  the  objective  is  to  derive  the  residence  of  a  serial  offender  from  the 
locations  of  the  successive  crimes  (Canter  2003,  Rossmo  2005).  The  ar¬ 
gument  is  that  offenders  are  most  likely  to  strike  within  their  own  activity 
space,  so  that  a  geographic  strategy  based  on  the  spatial  distribution  of  the 
events  or  on  the  distance  from  various  candidate  locations  to  the  crime 
events  provides  important  insights.  Geographic  profiling  has  seen  a  range 
of applications, such as tracking serial killers (Canter et al. 2000) or
commercial robberies (Laukkanen and Santilla 2006). Several methodological
aspects  have  received  attention,  such  as  the  effectiveness  of  decision  rules 
of  differing  complexity  (Snook  et  al.  2005)  and  the  sensitivity  of  the  dis¬ 
tance  decay  function  to  the  choice  of  distance  measure,  such  as  shortest 
distance  or  travel  time  (Kent  et  al.  2006).  The  distance  decay  function  that 
underlies geographic profiling is part of several software packages developed
to assist police investigators. These include Dragnet (Center for
Investigative Psychology, University of Liverpool*), CrimeStat (Levine
2007†), psycho geographic profiling Predator (Godwin and Rosen 2005),
and Rigel (Environmental Criminology Research Inc.‡).

* http://www.ipsv.com/publications/publications dragnet.php
† http://www.icpsr.umich.edu/CRIMESTAT/
‡ http://www.ecricanada.com/rigel/index.html

Rigel uses a patented Geographic Criminal Targeting (GCT) algorithm. Similar
in spirit to geographic profiling are threat maps based on an index of
vulnerability that is built up from accessibility measures to a number of
features in the landscape (e.g., Suen and Demirci 2006; Riese 2006). Again,
the fundamental driver is a distance decay function, reflected through a
particular accessibility index or a spatial kernel function.
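
The role of the distance decay function can be sketched as follows. The sketch scores candidate locations by summing a negative exponential decay over the distances to each crime site; it is a generic illustration and not the algorithm of Dragnet, CrimeStat, Predator, or Rigel. The decay parameter, coordinates, and function name are hypothetical.

```python
import math


def profile_scores(crimes, candidates, decay=1.0):
    """Score each candidate location by summed negative-exponential distance decay."""
    scores = {}
    for cand in candidates:
        scores[cand] = sum(math.exp(-decay * math.dist(cand, crime))
                           for crime in crimes)
    return scores


# Hypothetical crime sites and candidate anchor points (e.g., grid cell centers).
crimes = [(0.0, 0.0), (2.0, 0.0), (1.0, 2.0)]
candidates = [(1.0, 0.7), (5.0, 5.0), (-3.0, 0.0)]
scores = profile_scores(crimes, candidates)
best = max(scores, key=scores.get)
```

Evaluating the score over a fine grid of candidates produces the kind of surface on which investigative priorities can be ranked; replacing `math.dist` with a travel-time measure changes the surface, which is the sensitivity examined by Kent et al. (2006).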

Agent-based  models 

An  alternative  approach  towards  gaining  an  understanding  of  criminal  be¬ 
havior  is  not  based  on  the  analysis  of  actual  data,  but  on  the  simulation  of 
complex  systems,  driven  by  the  behavior  of  individual  agents.  So-called 
agent-based  modeling  is  increasingly  applied  in  the  modeling  of  military 
conflicts  such  as  urban  insurgency  (Diedrich  et  al.  2003).  The  use  of  a 
multi-agent  approach  has  also  gained  acceptance  in  criminology  as  a  way 
to  obtain  insight  into  the  complex  interactions  involved  in  criminal  behav¬ 
ior,  such  as  street  robbery  or  riots  (e.g.,  Groff  2007,  Torrens  2007a,  b). 
However,  to  date,  the  computational  and  data  requirements  needed  to 
mimic  realistic  contexts  still  require  considerable  further  research. 
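
The flavor of such simulations can be conveyed by a deliberately simple sketch. It is a toy model under strong assumptions and does not reproduce any of the cited models: offenders, targets, and guardians take random steps on a grid, and an offence is recorded whenever an offender shares a cell with a target and no guardian is present. All parameter values and names are hypothetical.

```python
import random


def simulate(n_offenders=5, n_targets=20, n_guardians=5, size=10, steps=200, seed=1):
    """Count offences per cell on a size x size grid of randomly moving agents."""
    rng = random.Random(seed)

    def place(k):
        return [(rng.randrange(size), rng.randrange(size)) for _ in range(k)]

    def step(pos):
        dx, dy = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)])
        return (min(size - 1, max(0, pos[0] + dx)),
                min(size - 1, max(0, pos[1] + dy)))

    offenders, targets, guardians = place(n_offenders), place(n_targets), place(n_guardians)
    offences = {}  # cell -> number of offences
    for _ in range(steps):
        offenders = [step(p) for p in offenders]
        targets = [step(p) for p in targets]
        guardians = [step(p) for p in guardians]
        guarded = set(guardians)
        occupied = set(targets)
        for cell in offenders:
            # Offence: offender meets a target in a cell with no guardian present.
            if cell in occupied and cell not in guarded:
                offences[cell] = offences.get(cell, 0) + 1
    return offences


offence_map = simulate()
```

Replacing the random steps with movement along a street network and empirically grounded activity schedules is where the computational and data requirements grow quickly.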


6  Surveillance  and  Forecasting 

Several of the techniques reviewed under the heading of space-time
exploration (C.4) and regression analysis (D.1) have been and could be
implemented as part of surveillance systems aimed at detecting important
changes  in  patterns  over  time.  Such  systems  have  a  strong  tradition  in  epi¬ 
demiology  and  public  health  analysis,  where  they  are  used  to  detect  the 
advent  of  a  new  epidemic  or  to  identify  an  unusual  outbreak  of  a  disease. 
The  ultimate  goal  of  surveillance  methods  is  to  develop  an  automated  de¬ 
cision  support  system  that  provides  “alerts”  when  needed. 

A  number  of  point  pattern  techniques  have  been  suggested  specifically  in 
the  context  of  surveillance.  For  example,  the  spatial  scan  statistic  of  Kull- 
dorff  (2001)  can  be  readily  implemented  to  accomplish  this.  Also,  Roger- 
son  (2001)  and  Rogerson  and  Sun  (2001)  track  the  change  over  time  in  the 
spatial  pattern  of  point  events  by  combining  a  nearest  neighbor  statistic 
and  a  cumulative  sum  method.  Porter  and  Brown  (2007)  suggest  a  method 
to detect the change in the distribution of a point process by constructing an
intensity  function  that  depends  on  features  (such  as  distance  to  land¬ 
marks)  as  a  special  case  of  marked  point  pattern  analysis. 
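
The cumulative sum idea behind such surveillance methods can be sketched on a simple series. The sketch monitors weekly event counts rather than a nearest neighbor statistic, so it illustrates the alerting mechanism only; the baseline, slack, and threshold values are hypothetical tuning choices.

```python
def cusum_alarms(counts, baseline, slack, threshold):
    """One-sided CUSUM on a series; returns running sums and alarm periods."""
    s = 0.0
    sums, alarms = [], []
    for t, x in enumerate(counts):
        # Accumulate only excesses over baseline plus slack; never go below zero.
        s = max(0.0, s + (x - baseline - slack))
        sums.append(s)
        if s > threshold:
            alarms.append(t)
            s = 0.0  # restart monitoring after an alert
    return sums, alarms


# Hypothetical weekly event counts with a jump in the fourth week.
weekly_counts = [2, 2, 2, 6, 6, 6]
sums, alarms = cusum_alarms(weekly_counts, baseline=2.0, slack=1.0, threshold=5.0)
```

The alert is raised in the second week after the jump: a single high count is tolerated, but a sustained excess accumulates until it crosses the threshold.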

An  alternative  perspective  is  based  on  the  time  domain  and  uses  forecast¬ 
ing  methods.  This  is  more  appropriate  in  allocating  future  crime  fighting 
resources,  for  example,  future  deployment  of  police  forces.  In  the  context 
of  a  spatial  analysis  of  crime,  forecasting  is  relevant  when  a  locational 
component  is  preserved.  To  have  sufficient  statistical  validity,  the  spatial 
units  of  analysis  will  typically  be  fairly  aggregate.  In  many  instances,  this 
precludes  a  meaningful  spatial  analysis. 

A  special  issue  of  the  Journal  of  Forecasting  (Gorr  and  Harries  2003)  con¬ 
siders  a  number  of  methodological  issues  pertaining  to  crime  forecasting, 
such  as  the  accuracy  for  small  areas  (Gorr  et  al.  2003).  A  number  of  novel 
combinations  of  techniques  are  suggested  as  well,  such  as  the  use  of  “fea¬ 
tures”  to  model  the  transition  density  between  patterns  of  events  over  time 
in  Liu  and  Brown  (2003),  and  the  combination  of  cluster  detection  with  an 
artificial  neural  network  forecasting  routine  in  Corcoran  et  al.  (2003). 

One common characteristic of crime forecasting techniques is the need for
a large number of data points, both over time and across space. This
requirement is not likely to be satisfied in a counterinsurgency context.

7 Towards a Methodological Template

From  a  methodological  viewpoint,  the  parallel  between  the  development  of 
actionable  intelligence  for  counterinsurgency  and  the  techniques  devel¬ 
oped  for  the  spatial  analysis  of  crime  is  attractive.  However,  several  limita¬ 
tions  need  to  be  considered  before  any  of  these  methods  can  be  applied 
directly  to  a  situation  of  urban  military  conflict.  The  main  constraint  per¬ 
tains  to  the  quality  and  availability  of  the  data.  Most  methods  of  spatial 
data  analysis  are  based  on  an  assumption  of  either  a  complete  count  of 
events  (e.g.,  in  point  pattern  analysis)  or  a  well-structured  sample  (e.g., 
the  basis  for  census  data).  Neither  of  these  can  be  expected  to  necessarily 
hold in a violent conflict situation. In addition, as pointed out in the
report, there may be a lack of information about the exact location of events as well
as  imprecision  in  the  measurement  of  neighborhood  and  other  socio¬ 
cultural  characteristics. 

Therefore,  a  high  priority  of  research  is  to  assess  the  extent  to  which  the 
conclusions  drawn  from  the  application  of  exploratory  and  explanatory 
spatial  data  analysis  remain  reliable  under  conditions  of  imprecise  infor¬ 
mation.  This  may  lend  itself  to  the  application  of  a  Bayesian  perspective, 
where  the  uncertainty  about  both  data  and  parameters  can  be  formally  ex¬ 
pressed.  Alternatively,  simulation  experiments  may  provide  insight  into 
the  information  loss  incurred  as  a  result  of  imperfect  measurement  and 
sampling.  To  date,  the  ramifications  of  this  in  the  context  of  the  types  of 
analyses  required  here  have  not  been  explored. 

A  methodological  template  would  then  consist  of  a  three-pronged  strategy, 
going  from  exploration  of  patterns  (pattern  analysis,  data  mining)  to  the 
formulation  of  an  explanatory  model  (relating  events,  rates  or  risk  to 
socio-cultural  covariates)  and  the  incorporation  of  these  into  a  decision 
support  system.  A  major  focus  of  attention  would  be  to  identify  those  types 
of  events  and  those  socio-cultural  characteristics  that  can  be  extracted 
from  non-traditional  data  sources  (e.g.,  text  mining  of  news  reports).  In  a 
first  stage  of  analysis,  these  variables  can  be  turned  into  indicators,  catego¬ 
ries  or  indexes  that  could  be  mapped,  and  whose  pattern  structure  could 
be  followed  over  time.  In  a  second  stage  of  analysis,  those  variables  could 
be  included  as  covariates  in  an  explanatory  model  to  provide  the  basis  for 
surveillance  and/or  forecasting  analysis. 

Acronyms and Abbreviations


Term 

ACUSTO 

ANSI 

ASAALT 

CERL 

COIN 

DC 

EDA 

ERDC 

ESDA 

GAM 

GCT

GIS 

GWR 

IAT 

IED 

IEEE 

IPB 

IT 

LINCS

LISA 

MAUP 

MDMP 

NSN 

OMB 

SANet 

STAC 

STARS 

TR 

UK 

ULS 

VISTA 


Spellout 

Actionable  Cultural  Understanding  for  Support  to  Tactical  Operations 
American  National  Standards  Institute 

Assistant  Secretary  of  the  Army  for  Acquisition,  Logistics,  and  Technology 

Construction  Engineering  Research  Laboratory 

Counterinsurgency 

District  Of  Columbia 

Exploratory  Data  Analysis 

Engineer  Research  and  Development  Center 

Exploratory  Spatial  Data  Analysis 

Geographical  Analysis  Machine 